 xpath query


XPath Agent: An Efficient XPath Programming Agent Based on LLM for Web Crawler

Li, Yu, Wang, Bryce, Luan, Xinyu

arXiv.org Artificial Intelligence

We present XPath Agent, a production-ready XPath programming agent specifically designed for web crawling and web GUI testing. A key feature of XPath Agent is its ability to automatically generate XPath queries from a set of sampled web pages using a single natural language query. To demonstrate its effectiveness, we benchmark XPath Agent against a state-of-the-art XPath programming agent across a range of web crawling tasks. Our results show that XPath Agent achieves comparable performance metrics while significantly reducing token usage and improving clock-time efficiency. The well-designed two-stage pipeline allows for seamless integration into existing web crawling or web GUI testing workflows, thereby saving time and effort in manual XPath query development. The source code for XPath Agent is available at https://github.com/eavae/feilian.
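As a rough illustration of the kind of artifact such an agent produces, the sketch below applies one XPath query to two sampled pages. The page snippets, the query, and the helper name `extract_text` are hypothetical and do not come from the paper or its repository. It relies on Python's standard-library `xml.etree.ElementTree`, which supports only a limited subset of XPath and expects well-formed markup.

```python
import xml.etree.ElementTree as ET

# Two hypothetical, well-formed page snippets standing in for sampled web pages.
PAGES = [
    "<html><body><div class='item'><span class='price'>10.50</span></div></body></html>",
    "<html><body><div class='item'><p>Sale</p><span class='price'>7.25</span></div></body></html>",
]

# Hypothetical query of the kind an agent might emit for "the product price".
PRICE_XPATH = ".//span[@class='price']"


def extract_text(page_source, xpath):
    """Parse one page and return the stripped text of every node matching xpath."""
    root = ET.fromstring(page_source)
    return [(node.text or "").strip() for node in root.findall(xpath)]


# The same query is reused across all sampled pages, even though their layouts differ.
prices = [extract_text(page, PRICE_XPATH) for page in PAGES]
```

Real web pages are rarely well-formed XML, so a production crawler would typically need an HTML-tolerant parser with fuller XPath support; that choice is outside what the abstract specifies.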


How to Transform your Google Spreadsheet Into an Opinion Mining Tool

@machinelearnbot

This blog was originally featured on blog.aylien.com, a Text Analysis blog with tutorials, Data Visualisations and industry discussions. Our founder, Parsa Ghaffari, gave a talk recently on Natural Language Processing and Sentiment Analysis at the Science Gallery in Dublin. As part of the talk, he put together a nice little example of how you can transform your Google Spreadsheet into a powerful Text Analysis and Data Mining tool. In this case, he took a simple example of analyzing restaurant reviews from a popular review site, but the same could be done for hotels, products, service offerings and so on. He wanted to show how easy it can be for data geeks, and even the less technical marketers among us, to start analyzing text and gathering business insight from the reams of textual data online today.


Node Selection Query Languages for Trees

Calvanese, Diego (Free University of Bozen-Bolzano) | Giacomo, Giuseppe De (DIS, Sapienza Universita Roma) | Lenzerini, Maurizio (DIS, Sapienza Universita Roma) | Vardi, Moshe Y. (Rice University)

AAAI Conferences

The study of node-selection query languages for (finite) trees has been a major topic in the recent research on query languages for Web documents. On one hand, there has been an extensive study of XPath and its various extensions. On the other hand, query languages based on classical logics, such as first-order logic (FO) or monadic second-order logic (MSO), have been considered. Results in this area typically relate an XPath-based language to a classical logic. What has yet to emerge is an XPath-related language that is as expressive as MSO and at the same time enjoys the computational properties of XPath, which are linear query evaluation and exponential query-containment test. In this paper we propose μXPath, which is the alternation-free fragment of XPath extended with fixpoint operators. Using two-way alternating automata, we show that this language does combine desired expressiveness and computational properties, placing it as an attractive candidate as the definite query language for trees.